How We Recovered from a Major System Failure in 72 Hours | Pivots with Alex Palatnick
Description
Alex Palatnick shares a high-stakes story of a critical system failure at 51 Mines and the tough pivot his team had to make to recover. When an ISIS array started acting up, the solution required a full deletion and rebuild, an intense decision that had the entire team working tirelessly to restore operations within 72 hours. Alex reflects on the challenges, the importance of quick decision-making, and the lessons learned from navigating technical crises.

Key Topics:
The unexpected failure of an ISIS array and its impact
How the engineering team assessed the situation
The critical decision to delete and rebuild the system
The importance of backing up data on LTO
The recovery process and lessons from the experience

Quotes:
"The Pivot was as simple as making the decision: delete the array, bring it back up again, and start restoring everything."
"Everybody was back working again within 72 hours, but it was ugly. Ugly. Wasn't any fun."
"We took a real hard look at it, made sure all the media was backed up, and pulled the trigger on the fix."

See more: https://www.indexrgb.com/

#TechRecovery #CrisisManagement #PivotAndRestore